Options

############################################################
##             RDA (Linear ordination, Correlations, Euclidean distance)
##             rda()
##            /  \                              
##Unconstrained   Constrained ---> ordination                    
##   (PCA)            (RDA)   ---> anova
##
##             tb-RDA (Transformation-based ordination, Hellinger distance)
##             rda()
##            /  \                              
##Unconstrained   Constrained ---> ordination                    
##   (PCA)            (RDA)   ---> anova
##
##              CA (Unimodal ordination, Chisq distance)
##              cca()
##             /  \
##Unconstrained   Constrained ---> ordination
## (CA)             (CCA)     ---> anova
##
##              DCA (Unimodal ordination, Chisq distance)
##             decorana()
##             /  \
##Unconstrained   Constrained ---> ordination
## (DCA)             (DCCA)     ---> anova
##
##             PCoA, dbRDA (metric MDS, any distance)
##             capscale()
##             /  \
##Unconstrained   Constrained ---> ordination
##                            ---> anova
##
##Unconstrained  ---> ordination
##               ---> envfit (overlay enviromental data) (permutation test)
##               ---> lm/glm etc (response or predictor)
#############################################################
##     Dissimilarity
##            --> MDS/nMDS      ---> ordination
##            --> bioenv   ---> variable importance       (perm test)
##            --> adonis2* ---> anova                     (perm test)
##            --> simper   ---> similarity percentages
##            --- betadisp ---> homogeneity of dispersion (perm test)
#############################################################
##     Model based ordination
##            ---> glmmTMB (via reduced rank / latent variable)
##            ---> gllvm (generalised latent variable models)
#############################################################
##     Model based
##            ---> manyglm ---> anova
##            ---> gllvm (generalized latent variable models)
#############################################################

Preparations

Load the necessary libraries

library(tidyverse) # for data wrangling etc
library(vegan)
library(ggvegan)
library(ggrepel)
library(GGally)
library(corrplot)
library(EcolUtils)
library(car)
library(scales)
library(patchwork)

Scenario

Smeenk-Enserink and AART (1974) was interested in the environmental drivers of hunting spider distributions within dunal habitats in the Netherlands. To explore, this, they measured the abundance of 12 species of hunting spider from 28 sites along a number of environmental gradients.

We will explore the patterns of hunting spider species composition using linear ordination techniques. These are linear as they assume that the individual abundances change in a linear manner over an environmental gradient.

Read in the data

spider.abund <- read_csv(file = "../public/data/spider.abund.csv", trim_ws = TRUE)
spider.abund %>% glimpse()
## Rows: 28
## Columns: 12
## $ Alopacce <dbl> 25, 0, 15, 2, 1, 0, 2, 0, 1, 3, 15, 16, 3, 0, 0, 0, 0, 0, 0, …
## $ Alopcune <dbl> 10, 2, 20, 6, 20, 6, 7, 11, 1, 0, 1, 13, 43, 2, 0, 3, 0, 1, 1…
## $ Alopfabr <dbl> 0, 0, 2, 0, 0, 0, 0, 0, 0, 1, 2, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Arctlute <dbl> 0, 0, 2, 1, 2, 6, 12, 0, 0, 0, 0, 0, 2, 1, 0, 0, 0, 0, 0, 0, …
## $ Arctperi <dbl> 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0…
## $ Auloalbi <dbl> 4, 30, 9, 24, 9, 6, 16, 7, 0, 0, 1, 0, 18, 4, 0, 0, 0, 0, 0, …
## $ Pardlugu <dbl> 0, 1, 1, 1, 1, 0, 1, 55, 0, 0, 0, 0, 1, 3, 6, 6, 2, 5, 12, 13…
## $ Pardmont <dbl> 60, 1, 29, 7, 2, 11, 30, 2, 26, 22, 95, 96, 24, 14, 0, 0, 0, …
## $ Pardnigr <dbl> 12, 15, 18, 29, 135, 27, 89, 2, 1, 0, 0, 1, 53, 15, 0, 2, 0, …
## $ Pardpull <dbl> 45, 37, 45, 94, 76, 24, 105, 1, 1, 0, 1, 8, 72, 72, 0, 0, 0, …
## $ Trocterr <dbl> 57, 65, 66, 86, 91, 63, 118, 30, 2, 1, 4, 13, 97, 94, 25, 28,…
## $ Zoraspin <dbl> 4, 9, 1, 25, 17, 34, 16, 3, 0, 0, 0, 0, 22, 32, 3, 4, 2, 0, 3…

Exploratory data analysis

correlation plots

spider.abund %>%
  cor() %>%
  corrplot(
    type = "upper",
    diag = FALSE
  )

## And now with axes arrange according to first princomp
spider.abund %>%
  cor() %>%
  corrplot(
    type = "upper",
    order = "FPC",
    diag = FALSE
  )

Scatterplot matrices

spider.abund %>%
  ggpairs(
    lower = list(continuous = "smooth"),
    diag = list(continuous = "density"),
    axisLabels = "show"
  )

## - clearly not normal

spider.abund^0.25 %>%
  ggpairs(
    lower = list(continuous = "smooth"),
    diag = list(continuous = "density"),
    axisLabels = "show"
  )

## - still not normal

DCA (detrended correspondance analysis)

spider.dca <- decorana(spider.abund^0.25)
spider.dca
## 
## Call:
## decorana(veg = spider.abund^0.25) 
## 
## Detrended correspondence analysis with 26 segments.
## Rescaling of axes with 4 iterations.
## 
##                   DCA1   DCA2    DCA3    DCA4
## Eigenvalues     0.4426 0.1822 0.10525 0.08791
## Decorana values 0.5078 0.1026 0.02482 0.01544
## Axis lengths    2.6739 1.6593 1.50636 1.22921

Conclusions:

There is evidence of non-normality and non-linearity although we could attempt to normalize via sqrt transform, this is unlikely to fix all.vars linearity also not good. It is likely that these issues are the result of the sampling occuring over a larger scale than the natural range of the taxa. This will result in some species being left skewed and others being right skewed. It will also result in non-linearity and the horseshoe effect.

Technically, these techniques also assume homogeneity of variance. However, it is not possible to explore residuals from these techniques.

Evidence of linearity and non-normality Solutions:

  • Metric ordination
  1. PCA - ignore at own peril
  2. hellinger trasformation (tb-PCA)
  3. CA (CCA)
  4. Distance-based RDA (capscale)
  • Non-metric
  1. nMDS

Scaling/transformations

spider.abund %>% head()
## Standardize columns to maximums
spider.abund %>%
  head() %>%
  decostand(method = "max", MARGIN = 2)
## spider.abund %>%
##     head() %>%
##     decostand(method = "range", MARGIN = 2)

## spider.abund %>%
##     head() %>%
##     decostand(method = "normalize", MARGIN = 2)

## Standardize rows to totals
spider.abund %>%
  head() %>%
  decostand(method = "total", MARGIN = 1)
## Double standardization
spider.abund %>%
  head() %>%
  wisconsin()
## Transformations
spider.abund %>%
  sqrt() %>%
  head() %>%
  wisconsin()

PCA

Raw data

spider.rda <- rda(spider.abund, scale = TRUE)
summary(spider.rda, display = NULL)
## 
## Call:
## rda(X = spider.abund, scale = TRUE) 
## 
## Partitioning of correlations:
##               Inertia Proportion
## Total              12          1
## Unconstrained      12          1
## 
## Eigenvalues, and their contribution to the correlations 
## 
## Importance of components:
##                         PC1    PC2    PC3     PC4     PC5     PC6     PC7
## Eigenvalue            5.112 1.8297 1.4590 0.87110 0.66292 0.57872 0.50456
## Proportion Explained  0.426 0.1525 0.1216 0.07259 0.05524 0.04823 0.04205
## Cumulative Proportion 0.426 0.5785 0.7000 0.77264 0.82788 0.87611 0.91816
##                           PC8     PC9    PC10     PC11     PC12
## Eigenvalue            0.39463 0.26149 0.19786 0.106973 0.021166
## Proportion Explained  0.03289 0.02179 0.01649 0.008914 0.001764
## Cumulative Proportion 0.95104 0.97283 0.98932 0.998236 1.000000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:

Screeplot

screeplot(spider.rda)
abline(a = 1, b = 0)

Biplot

## Quick and nasty ordination plots
biplot(spider.rda, scaling = "species")

biplot(spider.rda, scaling = "sites")

Ordiplot

## Quick and nasty ordination plots
pl <- vegan::ordiplot(spider.rda)
points(pl, "sites", pch = 21, col = "red", bg = "yellow")
text(pl, "sites", col = "red", cex = 0.9)
text(pl, "species", col = "blue", cex = 0.9)

## line(pl, "sites")

Autoplot

## ggvegan provides ways to use ggplot
## library(devtools)
## devtools::install_github("gavinsimpson/ggvegan")
## library(ggvegan)
autoplot(spider.rda)

autoplot(spider.rda) + theme_bw()

autoplot(spider.rda, geom = "text") + theme_bw()

Manually

spider.rda.scores <- spider.rda %>%
  fortify()
spider.rda.scores
g <-
  ggplot(data = NULL, aes(y = PC2, x = PC1)) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 0, linetype = "dotted") +
  geom_point(data = spider.rda.scores %>% filter(Score == "sites")) +
  geom_text(
    data = spider.rda.scores %>% filter(Score == "sites"),
    aes(label = Label), hjust = -0.2
  ) +
  geom_segment(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = 0, x = 0, yend = PC2, xend = PC1),
    arrow = arrow(length = unit(0.3, "lines")), color = "red"
  ) +
  ## geom_text(data=spider.rda.scores %>% filter(Score=='species'),
  ##           aes(y=PC2*1.1, x=PC1*1.1, label=Label), color='red') +
  geom_text_repel(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = PC2 * 1.1, x = PC1 * 1.1, label = Label), color = "red"
  ) +
  theme_bw()
g

# Nice axes titles
eig <- eigenvals(spider.rda)

paste(names(eig[2]), sprintf("(%0.1f%% explained var.)", 100 * eig[2] / sum(eig)))
## [1] "PC2 (15.2% explained var.)"
g <- g +
  scale_y_continuous(paste(names(eig[2]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[2] / sum(eig)
  ))) +
  scale_x_continuous(paste(names(eig[1]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[1] / sum(eig)
  )))

# put a circle
circle.prob <- 0.68
## circle.prob <- 0.95
## circle.prob <- 0.95
r <- sqrt(qchisq(circle.prob, df = 2)) * prod(colMeans(spider.rda$CA$u[, 1:2]^2))^(1 / 4)
theta <- c(seq(-pi, pi, length = 50), seq(pi, -pi, length = 50))
circle <- data.frame(PC1 = r * cos(theta), PC2 = r * sin(theta))
g <- g + geom_path(data = circle, aes(y = PC2, x = PC1), color = muted("white"), size = 1 / 2, alpha = 1 / 3)
g

g1 <- g

Standardized data

spider.std <- spider.abund %>%
  mutate(across(everything(), function(x) x^0.25)) %>%
  wisconsin()
spider.std
spider.rda <- rda(spider.std, scale = FALSE)
summary(spider.rda, display = NULL)
## 
## Call:
## rda(X = spider.std, scale = FALSE) 
## 
## Partitioning of variance:
##               Inertia Proportion
## Total          0.1008          1
## Unconstrained  0.1008          1
## 
## Eigenvalues, and their contribution to the variance 
## 
## Importance of components:
##                           PC1     PC2      PC3      PC4      PC5      PC6
## Eigenvalue            0.05808 0.02161 0.008713 0.004713 0.002107 0.001838
## Proportion Explained  0.57620 0.21437 0.086440 0.046758 0.020908 0.018235
## Cumulative Proportion 0.57620 0.79057 0.877010 0.923768 0.944676 0.962911
##                            PC7      PC8       PC9      PC10      PC11
## Eigenvalue            0.001335 0.001075 0.0006611 0.0004842 0.0001827
## Proportion Explained  0.013247 0.010665 0.0065593 0.0048041 0.0018125
## Cumulative Proportion 0.976159 0.986824 0.9933834 0.9981875 1.0000000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:

Manually

spider.rda.scores <- spider.rda %>%
  fortify()
spider.rda.scores
g <-
  ggplot(data = NULL, aes(y = PC2, x = PC1)) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 0, linetype = "dotted") +
  geom_point(data = spider.rda.scores %>% filter(Score == "sites")) +
  geom_text(
    data = spider.rda.scores %>% filter(Score == "sites"),
    aes(label = Label), hjust = -0.2
  ) +
  geom_segment(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = 0, x = 0, yend = PC2, xend = PC1),
    arrow = arrow(length = unit(0.3, "lines")), color = "red"
  ) +
  ## geom_text(data=spider.rda.scores %>% filter(Score=='species'),
  ##           aes(y=PC2*1.1, x=PC1*1.1, label=Label), color='red') +
  geom_text_repel(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = PC2 * 1.1, x = PC1 * 1.1, label = Label), color = "red"
  ) +
  theme_bw()
g

# Nice axes titles
eig <- eigenvals(spider.rda)

paste(names(eig[2]), sprintf("(%0.1f%% explained var.)", 100 * eig[2] / sum(eig)))
## [1] "PC2 (21.4% explained var.)"
g <- g +
  scale_y_continuous(paste(names(eig[2]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[2] / sum(eig)
  ))) +
  scale_x_continuous(paste(names(eig[1]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[1] / sum(eig)
  )))

# put a circle
circle.prob <- 0.68
## circle.prob <- 0.95
## circle.prob <- 0.95
r <- sqrt(qchisq(circle.prob, df = 2)) * prod(colMeans(spider.rda$CA$u[, 1:2]^2))^(1 / 4)
theta <- c(seq(-pi, pi, length = 50), seq(pi, -pi, length = 50))
circle <- data.frame(PC1 = r * cos(theta), PC2 = r * sin(theta))
g <- g + geom_path(data = circle, aes(y = PC2, x = PC1), color = muted("white"), size = 1 / 2, alpha = 1 / 3)
g

g2 <- g

tb-PCA data

Transformation-based PCA

spider.hel <- spider.abund %>%
  decostand(method = "hellinger")
spider.rda <- rda(spider.hel, scale = FALSE)
summary(spider.rda, display = NULL)
## 
## Call:
## rda(X = spider.hel, scale = FALSE) 
## 
## Partitioning of variance:
##               Inertia Proportion
## Total          0.4939          1
## Unconstrained  0.4939          1
## 
## Eigenvalues, and their contribution to the variance 
## 
## Importance of components:
##                          PC1    PC2     PC3     PC4     PC5      PC6     PC7
## Eigenvalue            0.2481 0.1192 0.06421 0.01590 0.01448 0.008943 0.00784
## Proportion Explained  0.5023 0.2413 0.13002 0.03219 0.02932 0.018109 0.01587
## Cumulative Proportion 0.5023 0.7436 0.87365 0.90584 0.93516 0.953270 0.96914
##                            PC8      PC9     PC10     PC11     PC12
## Eigenvalue            0.004929 0.003948 0.003284 0.002067 0.001010
## Proportion Explained  0.009981 0.007994 0.006649 0.004186 0.002046
## Cumulative Proportion 0.979125 0.987120 0.993769 0.997954 1.000000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:

Manually

spider.rda.scores <- spider.rda %>%
  fortify()
spider.rda.scores
g <-
  ggplot(data = NULL, aes(y = PC2, x = PC1)) +
  geom_hline(yintercept = 0, linetype = "dotted") +
  geom_vline(xintercept = 0, linetype = "dotted") +
  geom_point(data = spider.rda.scores %>% filter(Score == "sites")) +
  geom_text(
    data = spider.rda.scores %>% filter(Score == "sites"),
    aes(label = Label), hjust = -0.2
  ) +
  geom_segment(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = 0, x = 0, yend = PC2, xend = PC1),
    arrow = arrow(length = unit(0.3, "lines")), color = "red"
  ) +
  ## geom_text(data=spider.rda.scores %>% filter(Score=='species'),
  ##           aes(y=PC2*1.1, x=PC1*1.1, label=Label), color='red') +
  geom_text_repel(
    data = spider.rda.scores %>% filter(Score == "species"),
    aes(y = PC2 * 1.1, x = PC1 * 1.1, label = Label), color = "red"
  ) +
  theme_bw()
g

# Nice axes titles
eig <- eigenvals(spider.rda)

paste(names(eig[2]), sprintf("(%0.1f%% explained var.)", 100 * eig[2] / sum(eig)))
## [1] "PC2 (24.1% explained var.)"
g <- g +
  scale_y_continuous(paste(names(eig[2]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[2] / sum(eig)
  ))) +
  scale_x_continuous(paste(names(eig[1]), sprintf(
    "(%0.1f%% explained var.)",
    100 * eig[1] / sum(eig)
  )))

# put a circle
circle.prob <- 0.68
## circle.prob <- 0.95
## circle.prob <- 0.95
r <- sqrt(qchisq(circle.prob, df = 2)) * prod(colMeans(spider.rda$CA$u[, 1:2]^2))^(1 / 4)
theta <- c(seq(-pi, pi, length = 50), seq(pi, -pi, length = 50))
circle <- data.frame(PC1 = r * cos(theta), PC2 = r * sin(theta))
g <- g + geom_path(data = circle, aes(y = PC2, x = PC1), color = muted("white"), size = 1 / 2, alpha = 1 / 3)
g

g3 <- g

Comparison

g1 + g2 + g3

Environmental fit (envfit)

Read in the data

spider.env <- read_csv(file = "../public/data/spider.env.csv", trim_ws = TRUE)
spider.env %>% glimpse()
## Rows: 28
## Columns: 6
## $ soil.dry      <dbl> 2.3321, 3.0493, 2.5572, 2.6741, 3.0155, 3.3810, 3.1781, …
## $ bare.sand     <dbl> 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 2.3979, 0.0000, …
## $ fallen.leaves <dbl> 0.0000, 1.7918, 0.0000, 0.0000, 0.0000, 3.4340, 0.0000, …
## $ moss          <dbl> 3.0445, 1.0986, 2.3979, 2.3979, 0.0000, 2.3979, 0.6931, …
## $ herb.layer    <dbl> 4.4543, 4.5643, 4.6052, 4.6151, 4.6151, 3.4340, 4.6151, …
## $ reflection    <dbl> 3.9120, 1.6094, 3.6889, 2.9957, 2.3026, 0.6931, 2.3026, …

EDA

spider.env %>%
  as.data.frame() %>%
  ggpairs(
    lower = list(continuous = "smooth"),
    diag = list(continuous = "density"),
    axisLabels = "show"
  )

envfit

spider.envfit <- envfit(spider.rda, env = spider.env)
spider.envfit
## 
## ***VECTORS
## 
##                    PC1      PC2     r2 Pr(>r)    
## soil.dry       0.97715  0.21253 0.8749  0.001 ***
## bare.sand     -0.96446 -0.26422 0.5844  0.001 ***
## fallen.leaves  0.76789 -0.64058 0.8051  0.001 ***
## moss          -0.95547  0.29509 0.7559  0.001 ***
## herb.layer     0.08660  0.99624 0.5346  0.001 ***
## reflection    -0.90779  0.41943 0.8218  0.001 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Permutation: free
## Number of permutations: 999
spider.env.scores <- spider.envfit %>% fortify()
g <- g +
  geom_segment(
    data = spider.env.scores,
    aes(y = 0, x = 0, yend = PC2, xend = PC1),
    arrow = arrow(length = unit(0.3, "lines")), color = "blue"
  ) +
  geom_text(
    data = spider.env.scores,
    aes(y = PC2 * 1.1, x = PC1 * 1.1, label = Label), color = "blue"
  )
g

(Generalized) linear models

Extract scores

pc1 <- spider.rda.scores %>%
  filter(Score == "sites") %>%
  pull(PC1)
pc2 <- spider.rda.scores %>%
  filter(Score == "sites") %>%
  pull(PC2)

Test variance inflation

lm(1:nrow(spider.env) ~ soil.dry + bare.sand + fallen.leaves +
  moss + herb.layer + reflection, data = spider.env) %>%
  vif()
##      soil.dry     bare.sand fallen.leaves          moss    herb.layer 
##      4.344354      2.278902      5.871100      2.668214      2.476987 
##    reflection 
##      6.039663
lm(1:nrow(spider.env) ~ herb.layer + fallen.leaves +
  bare.sand + moss, data = spider.env) %>%
  vif()
##    herb.layer fallen.leaves     bare.sand          moss 
##      2.063289      3.434232      1.583657      2.067230
lm(pc1 ~ herb.layer + fallen.leaves + bare.sand + moss, data = spider.env) %>%
  summary()
## 
## Call:
## lm(formula = pc1 ~ herb.layer + fallen.leaves + bare.sand + moss, 
##     data = spider.env)
## 
## Residuals:
##       Min        1Q    Median        3Q       Max 
## -0.225806 -0.069301 -0.004843  0.049484  0.298719 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)    
## (Intercept)    0.11667    0.15853   0.736 0.469187    
## herb.layer     0.04578    0.02781   1.646 0.113389    
## fallen.leaves  0.05523    0.02198   2.512 0.019457 *  
## bare.sand     -0.08686    0.01896  -4.580 0.000133 ***
## moss          -0.11919    0.02359  -5.053 4.09e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.1261 on 23 degrees of freedom
## Multiple R-squared:  0.8999, Adjusted R-squared:  0.8825 
## F-statistic:  51.7 on 4 and 23 DF,  p-value: 3.622e-11
lm(pc2 ~ herb.layer + fallen.leaves + bare.sand + moss, data = spider.env) %>%
  summary()
## 
## Call:
## lm(formula = pc2 ~ herb.layer + fallen.leaves + bare.sand + moss, 
##     data = spider.env)
## 
## Residuals:
##      Min       1Q   Median       3Q      Max 
## -0.35751 -0.14768  0.01241  0.07293  0.51164 
## 
## Coefficients:
##               Estimate Std. Error t value Pr(>|t|)  
## (Intercept)   -0.33808    0.28609  -1.182   0.2494  
## herb.layer     0.13905    0.05019   2.770   0.0109 *
## fallen.leaves -0.06925    0.03967  -1.746   0.0942 .
## bare.sand     -0.07796    0.03422  -2.278   0.0323 *
## moss           0.03752    0.04257   0.882   0.3872  
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 0.2275 on 23 degrees of freedom
## Multiple R-squared:  0.6741, Adjusted R-squared:  0.6174 
## F-statistic: 11.89 on 4 and 23 DF,  p-value: 2.203e-05

Redundancy Analysis (RDA)

Constrained PCA

Prepare data

spider.std <- spider.abund %>%
  mutate(across(everything(), function(x) x^0.25)) %>%
  wisconsin()
spider.std

Fit RDA

spider.rda <- rda(spider.std ~
  scale(herb.layer) +
  scale(fallen.leaves) +
  scale(bare.sand) +
  scale(moss),
data = spider.env, scale = FALSE
)

Explore RDA

vif.cca(spider.rda)
##    scale(herb.layer) scale(fallen.leaves)     scale(bare.sand) 
##             2.063289             3.434232             1.583657 
##          scale(moss) 
##             2.067230
summary(spider.rda, display = NULL)
## 
## Call:
## rda(formula = spider.std ~ scale(herb.layer) + scale(fallen.leaves) +      scale(bare.sand) + scale(moss), data = spider.env, scale = FALSE) 
## 
## Partitioning of variance:
##               Inertia Proportion
## Total         0.10079     1.0000
## Constrained   0.06951     0.6896
## Unconstrained 0.03129     0.3104
## 
## Eigenvalues, and their contribution to the variance 
## 
## Importance of components:
##                          RDA1   RDA2     RDA3      RDA4     PC1     PC2
## Eigenvalue            0.05379 0.0127 0.002375 0.0006393 0.01006 0.00683
## Proportion Explained  0.53369 0.1260 0.023562 0.0063430 0.09978 0.06776
## Cumulative Proportion 0.53369 0.6597 0.683267 0.6896099 0.78939 0.85715
##                            PC3      PC4      PC5      PC6      PC7       PC8
## Eigenvalue            0.005833 0.002708 0.001679 0.001422 0.001047 0.0009151
## Proportion Explained  0.057866 0.026871 0.016654 0.014109 0.010391 0.0090788
## Cumulative Proportion 0.915017 0.941887 0.958541 0.972650 0.983042 0.9921205
##                             PC9      PC10      PC11
## Eigenvalue            0.0004719 0.0002206 0.0001017
## Proportion Explained  0.0046815 0.0021890 0.0010091
## Cumulative Proportion 0.9968019 0.9989909 1.0000000
## 
## Accumulated constrained eigenvalues
## Importance of components:
##                          RDA1   RDA2     RDA3      RDA4
## Eigenvalue            0.05379 0.0127 0.002375 0.0006393
## Proportion Explained  0.77390 0.1827 0.034168 0.0091979
## Cumulative Proportion 0.77390 0.9566 0.990802 1.0000000
## 
## Scaling 2 for species and site scores
## * Species are scaled proportional to eigenvalues
## * Sites are unscaled: weighted dispersion equal on all dimensions
## * General scaling constant of scores:

Validation (goodness of fit)

Goodness of fit test

Cumulative proportion of inertia accounted for (explained) by species (or sites) up to the chosen RDA axes.

Often used to remove species from ordination (those not well fit).

goodness(spider.rda, display = "species")
##                RDA1      RDA2      RDA3      RDA4
## Alopacce 0.83750803 0.8380824 0.8507434 0.8508038
## Alopcune 0.37508883 0.3919829 0.4346845 0.4709308
## Alopfabr 0.71308644 0.8330363 0.8401060 0.8418590
## Arctlute 0.01261829 0.1836158 0.3202399 0.3207326
## Arctperi 0.44818688 0.6150258 0.6212733 0.6305785
## Auloalbi 0.03451587 0.4000129 0.5119224 0.5291927
## Pardlugu 0.65510050 0.8043823 0.8112280 0.8120253
## Pardmont 0.37954864 0.4992717 0.5969931 0.6102758
## Pardnigr 0.08081868 0.2964765 0.3228101 0.3414144
## Pardpull 0.03055311 0.6247932 0.6249052 0.6306910
## Trocterr 0.58662722 0.6550687 0.6599862 0.6609695
## Zoraspin 0.57923767 0.5997179 0.6055232 0.6090685
goodness(spider.rda, display = "sites")
##               RDA1       RDA2       RDA3      RDA4
## row1  0.0524968824 0.66489912 0.68661136 0.6867522
## row2  0.4223428166 0.58887523 0.59910482 0.6009686
## row3  0.0014104962 1.17178602 1.17224850 1.1752071
## row4  0.0007365425 0.65428543 0.65452436 0.6560569
## row5  0.4219005774 0.81830721 0.99906989 1.0422787
## row6  0.0011298590 0.08754187 0.08766307 0.1540528
## row7  0.2096327312 0.66208875 0.75228508 0.7777694
## row8  1.4390550226 1.44029622 1.44368378 1.4659315
## row9  0.2187282376 0.33488298 0.42778146 0.4303627
## row10 0.7502662246 0.79439259 0.79812287 0.7998634
## row11 0.1957141107 0.41065463 0.49344886 0.4939953
## row12 0.1115623876 0.33843295 0.38543448 0.3854904
## row13 0.2285774470 0.24992517 0.50137563 0.5302574
## row14 0.0182942138 0.48640292 0.49450601 0.5013980
## row15 0.4681201993 0.47518991 0.47527035 0.4775999
## row16 0.8059684260 1.12437815 1.13493191 1.1484729
## row17 0.4714710739 0.47888453 0.47899210 0.4815764
## row18 0.5642036267 0.57392060 0.57624471 0.5768544
## row19 0.5351212205 0.65412775 0.68389913 0.6839012
## row20 0.5829017957 0.81356858 0.82117807 0.8311299
## row21 1.2249714616 1.40239040 1.40776690 1.4086559
## row22 1.0140638931 1.22857277 1.24525035 1.2567782
## row23 0.9521334029 0.95925339 0.96135532 0.9649602
## row24 0.5320168348 0.54321462 0.54398104 0.5452887
## row25 1.2216488014 1.27315318 1.87718068 1.8787120
## row26 0.4069732927 0.57833175 0.61028367 0.6152822
## row27 0.3122998697 0.31253661 0.31423345 0.3150001
## row28 0.7757451685 0.78608027 0.78646965 0.7910322

Proportion of inertia explained (decomposed) by constrained and unconstrained

inertcomp(spider.rda)
##                   CCA          CA
## Alopacce 0.0109479232 0.001919819
## Alopcune 0.0025228174 0.002834270
## Alopfabr 0.0132903683 0.002496561
## Arctlute 0.0006504174 0.001377495
## Arctperi 0.0071904603 0.004212498
## Auloalbi 0.0020658849 0.001837958
## Pardlugu 0.0113811900 0.002634617
## Pardmont 0.0050287681 0.003211389
## Pardnigr 0.0009975667 0.001924298
## Pardpull 0.0025351034 0.001484461
## Trocterr 0.0071863730 0.003686099
## Zoraspin 0.0057114190 0.003665883
## or proportionally
inertcomp(spider.rda, proportional = TRUE)
##                CCA        CA
## Alopacce 0.8508038 0.1491962
## Alopcune 0.4709308 0.5290692
## Alopfabr 0.8418590 0.1581410
## Arctlute 0.3207326 0.6792674
## Arctperi 0.6305785 0.3694215
## Auloalbi 0.5291927 0.4708073
## Pardlugu 0.8120253 0.1879747
## Pardmont 0.6102758 0.3897242
## Pardnigr 0.3414144 0.6585856
## Pardpull 0.6306910 0.3693090
## Trocterr 0.6609695 0.3390305
## Zoraspin 0.6090685 0.3909315

Anova

## overall test
anova(spider.rda)
# By axes
anova(spider.rda, by = "axis")
# By covariate
anova(spider.rda, by = "margin")

Ordination plot

autoplot(spider.rda, geom = "text") +
  theme_bw()

R2

RsquareAdj(spider.rda)
## $r.squared
## [1] 0.6896099
## 
## $adj.r.squared
## [1] 0.635629

References

Smeenk-Enserink, Nellie, and P. J. M. VAN DER AART. 1974. “Correlations Between Distributions of Hunting Spiders (Lycosidae, Ctenidae) and Environmental Characteristics in a Dune Area.” Netherlands Journal of Zoology 25 (1): 1–45. https://doi.org/https://doi.org/10.1163/002829675X00119.